Abstract: The optimal decoder achieving the outage capacity 
under imperfect channel estimation is investigated. First, by 
searching into the family of nearest neighbor decoders, which 
can be easily implemented on most practical coded modulation 
systems, we derive a decoding metric that minimizes the average 
of the transmission error probability over all channel estima- 
tion errors. This metric, for arbitrary memoryless channels, 
achieves the capacity of a composite (more noisy) channel. Next, 
according to the notion of estimation-induced outage capacity 
(EIO capacity) introduced in our previous work, we characterize 
maximal achievable information rates associated to the proposed 
decoder. The performance of the proposed decoding metric over 
uncorrelated Rayleigh fading MIMO channels is compared to 
both the classical mismatched maximum-likelihood (ML) decoder 
and the theoretical limits given by the EIO capacity (i.e. the best 
decoder in presence of channel estimation errors). Numerical 
results show that the derived metric provides significant gains, in 
terms of achievable information rates and bit error rate (BER), in 
a bit interleaved coded modulation (BICM) framework, without 
introducing any additional decoding complexity. 

I. Introduction 

Consider practical wireless communication systems in which
each receiver has access only to noisy channel estimates, which
may in some circumstances be poor, and these estimates are not
available at the transmitter. This constraint is a practical
concern for the design of such communication systems which,
in spite of their limited channel knowledge, have to ensure
communications with a prescribed quality of service (QoS).
This QoS requires guaranteeing a given target information rate
with small error probability, whatever the accuracy of the
channel estimate during the transmission. The described scenario
raises two important questions: (i) what are the theoretical
limits of reliable transmission rates, using the best possible
decoder in the presence of imperfect channel state information
at the receiver (CSIR), and (ii) how can those limits be
achieved by using practical decoders in coded modulation
systems? Of course, these questions are strongly related to a
notion of capacity that must take into account the
above-mentioned constraints.

Recently in [1], we addressed the first question (i)
for general memoryless channels, by introducing the notion
of estimation-induced outage capacity (EIO capacity). Basically,
we consider that a specific instance of the unknown
memoryless channel, with input $x \in \mathcal{X}$ and output $y \in \mathcal{Y}$, is
characterized by a transition probability $W(y|x,\theta) \in \mathcal{W}_\Theta$ with
an unknown channel state $\theta$, drawn i.i.d. according to $\theta \sim \psi(\theta)$;
$\mathcal{W}_\Theta$ is a family of conditional pdfs parameterized by the vector
of parameters $\theta \in \Theta \subseteq \mathbb{C}^d$. The receiver only knows an
estimate $\hat{\theta}$ and a characterization of its quality, in terms of
the conditional pdf $\psi(\theta|\hat{\theta})$. A decoder using $\hat{\theta}$ instead of $\theta$
obviously might not support an information rate $R$ (even small
rates might not be supported if $\hat{\theta}$ and $\theta$ are strongly different).
Consequently, outages induced by channel estimation errors
(CEE) will occur with a certain probability $\gamma_{\mathrm{QoS}}$.

The second question (ii), concerning the derivation of a
practical decoder that, using imperfect channel estimation,
can achieve information rates close to the EIO capacity, is
addressed in this paper. Classically, to deal with imperfect
channel state information (CSI), one sub-optimal technique,
known as mismatched maximum-likelihood (ML) decoding,
consists in replacing the exact channel by its estimate in the
decoding metric. However, this scheme is not adapted to the
presence of CEE, at least for systems with small training
overhead. As an alternative, Tarokh et al. [2] and
Taricco and Biglieri [3] proposed an improved ML detection
metric and applied it to a space-time coded MIMO system.
This metric can be formally derived as a special case of the
general framework presented here. In this paper, according to
the notion of EIO capacity, we derive the general expression
of a decoder that minimizes the average of the transmission
error probability over all CEE and consequently achieves
the capacity of a composite (more noisy) channel. Then,
we evaluate this decoder for Rayleigh fading MIMO channels and
investigate maximal achievable information rates.

A. A Brief Review of Estimation-induced Outage Capacity 

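A message $m \in \mathcal{M} = \{1, \dots, \lfloor \exp(nR) \rfloor\}$ is transmitted
using a pair $(\varphi, \phi)$ of mappings, where $\varphi : \mathcal{M} \to \mathcal{X}^n$ is
the encoder, and $\phi : \mathcal{Y}^n \times \Theta \to \mathcal{M}$ is the decoder (that
utilizes $\hat{\theta}$). The random rate, which depends on the unknown
channel realization $\theta$ through its probability of error, is given
by $n^{-1} \log M_{\theta,\hat{\theta}}$. The maximum error probability is

$$e^{(n)}_{\max}(\varphi, \phi, \hat{\theta}; \theta) = \max_{m \in \mathcal{M}} W^n\big(\{\phi(\mathbf{y}, \hat{\theta}) \neq m\} \,\big|\, \varphi(m), \theta\big). \quad (1)$$

For a given channel estimate $\hat{\theta}$, and $0 < \epsilon, \gamma_{\mathrm{QoS}} < 1$, an outage
rate $R \geq 0$ is $(\epsilon, \gamma_{\mathrm{QoS}})$-achievable if for every $\delta > 0$ and every
sufficiently large $n$ there exists a sequence of length-$n$ block
codes such that the rate satisfies

$$\Pr\big(\Lambda_\epsilon(R, \hat{\theta}) \,\big|\, \hat{\theta}\big) = \int_{\Lambda_\epsilon(R, \hat{\theta})} d\psi(\theta|\hat{\theta}) \geq 1 - \gamma_{\mathrm{QoS}}, \quad (2)$$

where $\Lambda_\epsilon(R, \hat{\theta}) = \{\theta \in \Lambda_\epsilon : n^{-1} \log M_{\theta,\hat{\theta}} \geq R - \delta\}$, and
$\Lambda_\epsilon = \{\theta \in \Theta : e^{(n)}_{\max}(\varphi, \phi, \hat{\theta}; \theta) \leq \epsilon\}$ is the set of all channel
states allowing for reliable decoding. This definition requires
that maximum error probabilities larger than $\epsilon$ occur with
probability less than $\gamma_{\mathrm{QoS}}$. The practical advantage of such a
definition is that for $100 \cdot (1 - \gamma_{\mathrm{QoS}})\%$ of estimates, the transmitter
and receiver strive to construct codes for ensuring the desired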
communication service. The EIO capacity is then defined as 
the largest $(\epsilon, \gamma_{\mathrm{QoS}})$-achievable rate, for an outage probability
$\gamma_{\mathrm{QoS}}$ and a given estimate $\hat{\theta}$, as

$$C(\gamma_{\mathrm{QoS}}, \hat{\theta}) = \max_{P \in \mathcal{P}_\Gamma(\mathcal{X})} \ \sup_{\Lambda \subset \Theta :\ \Pr(\Lambda|\hat{\theta}) \geq 1 - \gamma_{\mathrm{QoS}}} \ \inf_{\theta \in \Lambda} I\big(P, W(\cdot|\cdot, \theta)\big), \quad (3)$$

where $I(\cdot)$ denotes the mutual information of the channel
$W(y|x,\theta)$ and $\mathcal{P}_\Gamma(\mathcal{X})$ is the set of input distributions not
depending on $\theta$. The theoretical decoder achieving the capacity
(3), based on the well-known method of typical sequences,
cannot be implemented on practical communication systems.
Indeed, in [4] the achievable rates obtained with mismatched
ML decoding have been shown to be considerably smaller
than the EIO capacity.

B. A Practical Decoder Using Channel Estimation Accuracy

We now consider the problem of deriving a practical decoder
that achieves the capacity (3). Assume that we restrict
the search for decoding functions $\phi$ to the class of additive
decoding metrics, which can be implemented on realistic
systems. This means that for a given channel output
$\mathbf{y} = (y_1, \dots, y_n)$, we set the decoding function

$$\phi_{\mathcal{D}}(\mathbf{y}, \hat{\theta}) = \arg\min_{m \in \mathcal{M}} \mathcal{D}^n\big(\varphi(m), \mathbf{y} \,\big|\, \hat{\theta}\big), \quad (4)$$

where $\mathcal{D}^n(\mathbf{x}, \mathbf{y}|\hat{\theta}) = \frac{1}{n} \sum_{i=1}^{n} \mathcal{D}(x_i, y_i|\hat{\theta})$ and
$\mathcal{D} : \mathcal{X} \times \mathcal{Y} \times \Theta \to \mathbb{R}_{\geq 0}$ is an arbitrary per-letter additive metric.
Consequently, the maximization in (3) is actually equivalent to
maximizing over all decoding metrics $\mathcal{D}$. However, we note
that this restriction does not necessarily lead to an optimal
decoder achieving the capacity.

In order to find the optimal decoding metric $\mathcal{D}$ maximizing
the outage rates, for a given probability $\gamma_{\mathrm{QoS}}$ and estimate $\hat{\theta}$, it
is necessary to look at the intrinsic properties of the capacity
definition. Observe that the size of the set $\Lambda_\epsilon$ of all channel
states allowing for reliable decoding is determined by the
chosen decoding function $\phi$, and the maximal achievable rate
$R$, constrained by the outage probability (2), is then limited by
this size. Thus, for a given decoder $\phi$, there exists an optimal
set $\Lambda^* \subseteq \Lambda_\epsilon$ of channel states with conditional probability
larger than $1 - \gamma_{\mathrm{QoS}}$, providing the largest achievable rate,
which follows as the minimal instantaneous rate for the worst
$\theta \in \Lambda^*$. The optimal set $\Lambda^*$ is the set that maximizes
the expression (3). Hence, an optimal decoding metric must
guarantee minimum error probability (1) for every $\theta \in \Lambda^*$.
The computation of such a metric then becomes very difficult,
since the maximization in (3) by using $\phi_{\mathcal{D}}$ is not really an
explicit function of $\mathcal{D}$.

Instead of trying to find an optimal decoding metric minimizing
the transmission error probability (1) for every $\theta \in \Lambda^*$,
we propose to look at the decoding metric minimizing the
average of this error probability over all CEE. This means

$$\mathcal{D}_M = \arg\min_{\mathcal{D}} \int_{\Theta} e^{(n)}_{\max}(\varphi, \phi_{\mathcal{D}}, \hat{\theta}; \theta) \, d\psi(\theta|\hat{\theta}), \quad (5)$$

where $e^{(n)}_{\max}(\varphi, \phi_{\mathcal{D}}, \hat{\theta}; \theta)$ follows by replacing (4) in (1). Actually, for $n$
sufficiently large, this optimization problem can be solved
by setting

$$\mathcal{D}_M(x, y|\hat{\theta}) = -\log \widetilde{W}(y|x, \hat{\theta}), \quad (6)$$

where $\widetilde{W}(y|x, \hat{\theta}) = \int_{\Theta} W(y|x, \theta) \, d\psi(\theta|\hat{\theta})$ is the channel resulting
from the average of the unknown channel over all CEE, given
the estimate $\hat{\theta}$. Here we do not go into the details of how the
optimal metric (6) minimizes (5). Basically, the average of the
transmission error probability leads to the composite (more
noisy) channel $\widetilde{W}(y|x, \hat{\theta})$, and then we take the logarithm of
this composite channel to obtain its ML decoder.
In the remainder of this paper, we evaluate the derived
decoding metric (6) for uncorrelated Rayleigh fading MIMO
channels and use it in a bit interleaved coded modulation
(BICM) receiver (Section II). Then, we compute the achievable
rates according to the considered notion of the EIO capacity
(Section III). In Section IV, we illustrate via numerical
simulations the performance of the improved decoder and compare
it to the mismatched ML decoder.

II. Channel Model, Decoding with Imperfect CSIR and Receiver Processing

We use upper case and lower case boldface letters for matrices
and vectors, respectively; $\|\cdot\|_F$ denotes the Frobenius norm,
diag(x) denotes a diagonal matrix with elements x, diag(H)
denotes the vector corresponding to the diagonal elements of
the matrix H, and $(\cdot)^\dagger$ denotes Hermitian transposition.

A. MIMO Channel model 

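Consider a single-user memoryless fading MIMO channel
with $M_T$ transmit and $M_R$ receive antennas. The discrete-time
channel at time $t$ is modeled by

$$\mathbf{y}(t) = \mathbf{H}(t)\mathbf{x}(t) + \mathbf{z}(t), \quad (7)$$

where $\mathbf{x}(t) \in \mathbb{C}^{M_T \times 1}$ is the vector of transmitted symbols
and $\mathbf{y}(t) \in \mathbb{C}^{M_R \times 1}$ is the vector of received symbols;
$\theta = \mathbf{H}(t) \in \mathbb{C}^{M_R \times M_T}$ is the complex random matrix
whose entries are i.i.d. zero-mean circularly symmetric complex
Gaussian (ZMCSCG) random variables $\mathcal{CN}(0, \sigma_h^2)$. The
channel is a complex normal distributed matrix
$\mathbf{H}(t) \sim \psi(\theta) = \mathcal{CN}(\mathbf{0}, \mathbf{I}_{M_T} \otimes \boldsymbol{\Sigma}_H)$, with $\boldsymbol{\Sigma}_H = \sigma_h^2 \mathbf{I}_{M_R}$. The noise vector
$\mathbf{z}(t) \in \mathbb{C}^{M_R \times 1}$ is a ZMCSCG random vector with
covariance matrix $\boldsymbol{\Sigma}_0 = \sigma_z^2 \mathbf{I}_{M_R}$. This leads to a channel
$W(\mathbf{y}|\mathbf{x}, \mathbf{H}) = \mathcal{CN}(\mathbf{H}\mathbf{x}, \boldsymbol{\Sigma}_0)$. The input symbols are constrained
to satisfy $\mathrm{tr}\big(\mathbb{E}_{\mathbf{x}}(\mathbf{x}(t)\mathbf{x}(t)^\dagger)\big) \leq P$.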

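Channel estimation: We assume that the transmitter, before
sending the data $\mathbf{x}$, can teach the channel to the receiver
by sending a training sequence of $N$ vectors
$\mathbf{X}_T = (\mathbf{x}_{T,1}, \dots, \mathbf{x}_{T,N})$. We assume that the coherence time of the
channel is much longer than the training time and that the average
energy of the training symbols is $P_T = \frac{1}{N M_T} \mathrm{tr}(\mathbf{X}_T \mathbf{X}_T^\dagger)$.
This sequence is affected by the channel matrix $\mathbf{H}$, allowing
the receiver to perform ML estimation of $\mathbf{H}$ from the observed
signals $\mathbf{Y}_T = \mathbf{H}\mathbf{X}_T + \mathbf{Z}_T$ and $\mathbf{X}_T$. This yields $\hat{\theta} = \hat{\mathbf{H}} = \mathbf{H} + \mathcal{E}$,
where $\mathcal{E}$ denotes the estimation error matrix. When the training
sequences are orthogonal, the error is white, with covariance
$\boldsymbol{\Sigma}_{\mathcal{E}} = \sigma_{\mathcal{E}}^2 \mathbf{I}_{M_R}$ and $\sigma_{\mathcal{E}}^2 = \mathrm{SNR}_T^{-1}$, where
$\mathrm{SNR}_T = N P_T / \sigma_z^2$.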
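
The claim that orthogonal training gives a white estimation error of variance $1/\mathrm{SNR}_T = \sigma_z^2/(N P_T)$ per entry can be checked numerically. The sketch below is a rough Monte Carlo experiment under assumptions that are not in the text: the training matrix is built from scaled DFT rows (one convenient orthogonal choice), and all function names and parameter values are illustrative.

```python
import cmath
import math
import random

def complex_gaussian(rng, variance):
    """Draw one circularly symmetric complex Gaussian sample with the given variance."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def training_matrix(m_t, n, p_t):
    """Scaled DFT rows, so that X X^H = n * p_t * I (requires n >= m_t)."""
    a = math.sqrt(p_t)
    return [[a * cmath.exp(2j * math.pi * i * k / n) for k in range(n)]
            for i in range(m_t)]

def ml_estimate(y_t, x_t, p_t):
    """ML estimate H_hat = Y_T X_T^H / (N P_T), valid for orthogonal training."""
    n, m_t = len(x_t[0]), len(x_t)
    return [[sum(row[k] * x_t[j][k].conjugate() for k in range(n)) / (n * p_t)
             for j in range(m_t)] for row in y_t]

def empirical_error_variance(m_r, m_t, n, p_t, sigma_h2, sigma_z2, trials, seed=0):
    """Average of |H_hat - H|^2 per entry over independent channel draws."""
    rng = random.Random(seed)
    x_t = training_matrix(m_t, n, p_t)
    total = 0.0
    for _ in range(trials):
        h = [[complex_gaussian(rng, sigma_h2) for _ in range(m_t)] for _ in range(m_r)]
        y_t = [[sum(h[r][j] * x_t[j][k] for j in range(m_t)) + complex_gaussian(rng, sigma_z2)
                for k in range(n)] for r in range(m_r)]
        h_hat = ml_estimate(y_t, x_t, p_t)
        total += sum(abs(h_hat[r][j] - h[r][j]) ** 2
                     for r in range(m_r) for j in range(m_t))
    return total / (trials * m_r * m_t)

measured = empirical_error_variance(2, 2, 4, 1.0, 1.0, 0.5, trials=2000)
predicted = 0.5 / (4 * 1.0)  # sigma_z^2 / (N * P_T) = 1 / SNR_T
print(measured, predicted)
```

The measured value fluctuates around the prediction with the usual Monte Carlo error; increasing `trials` tightens the agreement.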

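Mismatched ML decoder: The classical mismatched ML
decoder uses the likelihood function of the channel evaluated at
the channel estimate $\hat{\mathbf{H}}$, $\mathcal{D}_{\mathrm{ML}}(\mathbf{x}, \mathbf{y}|\hat{\mathbf{H}}) = -\log W(\mathbf{y}|\mathbf{x}, \hat{\mathbf{H}})$.
This leads to the following Euclidean distance

$$\mathcal{D}_{\mathrm{ML}}(\mathbf{x}, \mathbf{y}|\hat{\mathbf{H}}) = \|\mathbf{y} - \hat{\mathbf{H}}\mathbf{x}\|^2 + \mathrm{const}. \quad (8)$$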

B. Metric computation 

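We now evaluate the general metric expression (6) in
the case of the MIMO channel model (7). To this end, we first
derive the pdf $\psi(\theta|\hat{\theta})$, which can be obtained by using the
likelihood function, the pdf $W(\mathbf{y}|\mathbf{x}, \mathbf{H})$, and $\psi(\theta)$. Then, by
averaging the channel $W(\mathbf{y}|\mathbf{x}, \mathbf{H})$ over all CEE and after some
algebra, we obtain the channel
$\widetilde{W}(\mathbf{y}|\mathbf{x}, \hat{\mathbf{H}}) = \mathcal{CN}\big(\delta\hat{\mathbf{H}}\mathbf{x}, \ \boldsymbol{\Sigma}_0 + \delta\boldsymbol{\Sigma}_{\mathcal{E}}\|\mathbf{x}\|^2\big)$,
where $\delta = \frac{\mathrm{SNR}_T \sigma_h^2}{\mathrm{SNR}_T \sigma_h^2 + 1}$. Finally, from (6) the optimal
decoding metric for this channel reduces to

$$\mathcal{D}_M(\mathbf{x}, \mathbf{y}|\hat{\mathbf{H}}) = M_R \log\big(\sigma_z^2 + \delta\sigma_{\mathcal{E}}^2\|\mathbf{x}\|^2\big) + \frac{\|\mathbf{y} - \delta\hat{\mathbf{H}}\mathbf{x}\|^2}{\sigma_z^2 + \delta\sigma_{\mathcal{E}}^2\|\mathbf{x}\|^2}, \quad (9)$$

and this metric coincides with that proposed for space-time
decoding, from independent results in [2] and [3].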
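
To make the difference between the two metrics concrete, the following sketch evaluates the mismatched Euclidean metric (8) and the improved metric (9) for one candidate symbol vector. It assumes $\sigma_{\mathcal{E}}^2 = 1/\mathrm{SNR}_T$ and $\delta = \mathrm{SNR}_T\sigma_h^2/(\mathrm{SNR}_T\sigma_h^2 + 1)$ as in the channel estimation model of this section; the function names and the numerical values are hypothetical.

```python
import math

def matvec(h, x):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in h]

def sq_norm(v):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(a) ** 2 for a in v)

def mismatched_metric(x, y, h_hat):
    """Eq. (8) without the additive constant: ||y - H_hat x||^2."""
    hx = matvec(h_hat, x)
    return sq_norm([yi - hi for yi, hi in zip(y, hx)])

def improved_metric(x, y, h_hat, sigma_z2, sigma_h2, snr_t):
    """Eq. (9): M_R log(var) + ||y - delta H_hat x||^2 / var."""
    sigma_e2 = 1.0 / snr_t
    delta = snr_t * sigma_h2 / (snr_t * sigma_h2 + 1.0)
    var = sigma_z2 + delta * sigma_e2 * sq_norm(x)
    hx = matvec(h_hat, x)
    resid = sq_norm([yi - delta * hi for yi, hi in zip(y, hx)])
    return len(y) * math.log(var) + resid / var

# Illustrative 2 x 2 example (all values are arbitrary).
h_hat = [[1 + 0.5j, 0.2], [-0.3j, 0.8]]
x = [1 + 1j, -1 + 1j]
y = [0.5 + 1j, -1 + 0.2j]
sigma_z2 = 0.5

d_ml = mismatched_metric(x, y, h_hat)
d_short = improved_metric(x, y, h_hat, sigma_z2, sigma_h2=1.0, snr_t=2.0)
d_long = improved_metric(x, y, h_hat, sigma_z2, sigma_h2=1.0, snr_t=1e12)
print(d_ml, d_short, d_long)
```

For very large $\mathrm{SNR}_T$ the improved metric approaches $M_R \log \sigma_z^2 + \|\mathbf{y} - \hat{\mathbf{H}}\mathbf{x}\|^2/\sigma_z^2$, an affine function of the Euclidean distance, so both metrics rank candidates identically in that regime.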

C. Receiver structure 

The problem of decoding MIMO-BICM has been addressed
in [5] under the assumption of perfect CSIR. Here we consider
the same problem with CEE, for which we apply the metric (9)
in the iterative decoding process of BICM. Basically, the
receiver consists of two sub-blocks operating successively.
The first sub-block, referred to as the soft symbol-to-bit
MIMO demapper, produces bit metrics (probabilities) from
the input symbols, and the second one is a soft-input soft-output
(SISO) trellis decoder. Each sub-block can take advantage of
the a posteriori probabilities (APP) provided by the other
sub-block as additional information. Here, SISO decoding is
performed using the well-known forward-backward algorithm [6].
We now recall the formulation of the soft MIMO detector.

Suppose first that the channel matrix $\mathbf{H}_k$ is
perfectly known at the receiver. The MIMO demapper provides
at its output the extrinsic probabilities on the coded and interleaved
bits $d$. Let $d_{k,i}$, $i = 1, \dots, BM_T$, be the interleaved bits
corresponding to the $k$-th compound symbol $\mathbf{x}_k \in \mathcal{Q}$, where
the cardinality of $\mathcal{Q}$ is $|\mathcal{Q}| = 2^{BM_T}$. The extrinsic probability
$P_{\mathrm{dem}}(d_{k,j})$ of the bit $d_{k,j}$ (bit metric) at the MIMO demapper
output is calculated as

$$P_{\mathrm{dem}}(d_{k,j} = 1) = K \sum_{\mathbf{x}_k \in \mathcal{Q} :\, d_{k,j} = 1} \exp\big[-\mathcal{D}(\mathbf{x}_k, \mathbf{y}_k|\mathbf{H}_k)\big] \prod_{i = 1,\, i \neq j}^{BM_T} P_{\mathrm{dec}}(d_{k,i}), \quad (10)$$

where $K$ is the normalization factor satisfying
$P_{\mathrm{dem}}(d_{k,j} = 1) + P_{\mathrm{dem}}(d_{k,j} = 0) = 1$ and $P_{\mathrm{dec}}(d_{k,j})$ is the prior
information on bit $d_{k,j}$, coming from the SISO decoder. The
summation in (10) is taken over the product of the channel
likelihood given a compound symbol $\mathbf{x}_k$, and the a priori
probability on this symbol (the term $\prod P_{\mathrm{dec}}$) fed back from
the SISO decoder at the previous iteration. Concerning this
latter term, the a priori probability of the bit $d_{k,j}$ itself
has been excluded, so as to allow the exchange of extrinsic
information between the channel decoder and the MIMO
demapper. At the first iteration we set $P_{\mathrm{dec}}(d_{k,i}) = 1/2$.
Notice that by replacing the unknown channel involved in (10)
by its estimate $\hat{\mathbf{H}}_k$, we obtain the mismatched ML decoder
of MIMO-BICM. Instead of this, we introduce in (10) the
demapping rule $\mathcal{D}_M$ of (9), which is adapted to the CEE.
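
The demapping rule (10) can be sketched by brute-force enumeration of all bit labels of one compound symbol. The example below is only illustrative: it assumes BPSK on each of two antennas (so $B = 1$), a known real channel, and a metric equal to the squared distance divided by the noise variance; `demap_bit_probabilities`, `bpsk_map` and `scaled_distance` are hypothetical helper names. Any other per-symbol metric, such as (9), could be passed in its place.

```python
import math
from itertools import product

def matvec(h, x):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in h]

def scaled_distance(h, sigma_z2):
    """Return metric(x, y) = ||y - H x||^2 / sigma_z^2 for a fixed channel h."""
    def metric(x, y):
        hx = matvec(h, x)
        return sum(abs(yi - hi) ** 2 for yi, hi in zip(y, hx)) / sigma_z2
    return metric

def bpsk_map(bits):
    """Map bit 0 to +1 and bit 1 to -1 on each antenna (illustrative labeling)."""
    return [1.0 - 2.0 * b for b in bits]

def demap_bit_probabilities(y, mapper, priors_one, metric):
    """Extrinsic P(d_j = 1) for every bit j of one compound symbol, as in (10).

    priors_one[i] is the decoder's prior probability that bit i equals 1.
    """
    n_bits = len(priors_one)
    mass = [[0.0, 0.0] for _ in range(n_bits)]  # mass[j][b]: weight for d_j = b
    for bits in product((0, 1), repeat=n_bits):
        likelihood = math.exp(-metric(mapper(bits), y))
        for j in range(n_bits):
            extrinsic = 1.0
            for i, b in enumerate(bits):
                if i != j:  # the prior of bit j itself is excluded
                    extrinsic *= priors_one[i] if b == 1 else 1.0 - priors_one[i]
            mass[j][bits[j]] += likelihood * extrinsic
    return [m1 / (m0 + m1) for m0, m1 in mass]

h = [[1.0, 0.0], [0.0, 1.0]]
y = [0.9, -1.1]
metric = scaled_distance(h, sigma_z2=0.5)
p_first = demap_bit_probabilities(y, bpsk_map, [0.5, 0.5], metric)
p_other = demap_bit_probabilities(y, bpsk_map, [0.9, 0.5], metric)
print(p_first)
```

Because the prior of bit $j$ is left out of its own product, changing the prior of the first bit leaves its extrinsic output unchanged, which is the property that allows iterative exchange with the SISO decoder.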

III. Computation of Achievable Information Rates 

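We now derive the achievable rates $C_{\mathcal{D}}$ associated with a
receiver using the decoding rule (4), based on the derived
metric (9). This is done by using the following theorem [7],
for the considered channels $W(\mathbf{y}|\mathbf{x}, \mathbf{H}) = \mathcal{CN}(\mathbf{H}\mathbf{x}, \boldsymbol{\Sigma}_0)$.

Theorem 3.1: For any pair of matrices $(\mathbf{H}, \hat{\mathbf{H}})$, the maximal
achievable rate associated with a receiver using a metric
$\mathcal{D}(\mathbf{x}, \mathbf{y}|\hat{\mathbf{H}})$ is given by

$$C_{\mathcal{D}}(\mathbf{H}, \hat{\mathbf{H}}) = \sup_{P_X} \ \inf_{V_{Y|X} \in \mathcal{V}(\mathbf{H}, \hat{\mathbf{H}})} I(P_X, V_{Y|X}), \quad (11)$$

where the mutual information functional is

$$I(P_X, V_{Y|X}) = \iint \log_2 \frac{V_{Y|X}(\mathbf{y}|\mathbf{x}, \boldsymbol{\Upsilon})}{\int V_{Y|X}(\mathbf{y}|\mathbf{x}', \boldsymbol{\Upsilon}) \, dP_X(\mathbf{x}')} \, dP_X(\mathbf{x}) \, dV_{Y|X}(\mathbf{y}|\mathbf{x}, \boldsymbol{\Upsilon}), \quad (12)$$

and $\mathcal{V}(\mathbf{H}, \hat{\mathbf{H}})$ denotes the set of test channels, i.e., all possible
uncorrelated MIMO channels $V_{Y|X}(\mathbf{y}|\mathbf{x}, \boldsymbol{\Upsilon}) = \mathcal{CN}(\boldsymbol{\Upsilon}\mathbf{x}, \boldsymbol{\Sigma})$,
verifying that

$(c_1)$: $\mathrm{tr}\big(\mathbb{E}_P\{\mathbb{E}_V\{\mathbf{y}\mathbf{y}^\dagger\}\}\big) = \mathrm{tr}\big(\mathbb{E}_P\{\mathbb{E}_W\{\mathbf{y}\mathbf{y}^\dagger\}\}\big)$,
$(c_2)$: $\mathbb{E}_P\{\mathbb{E}_V\{\mathcal{D}(\mathbf{x}, \mathbf{y}|\hat{\mathbf{H}})\}\} \leq \mathbb{E}_P\{\mathbb{E}_W\{\mathcal{D}(\mathbf{x}, \mathbf{y}|\hat{\mathbf{H}})\}\}$.

Computation of achievable rates: In order to solve the
constrained minimization problem in Theorem 3.1 for our
metric $\mathcal{D} = \mathcal{D}_M$ (expression (9)), we must find the channel
$\boldsymbol{\Upsilon} \in \mathbb{C}^{M_R \times M_T}$ and the covariance matrix $\boldsymbol{\Sigma} = \mathbf{I}_{M_R}\sigma^2$
defining the test channel $V_{Y|X}(\mathbf{y}|\mathbf{x}, \boldsymbol{\Upsilon})$ that minimizes the
mutual information (12). On the other hand, throughout this
paper we assume that the transmitter does not have access to
the channel estimates, and consequently no power control is
possible. Thus, we choose the sub-optimal input distribution
$P_X = \mathcal{CN}(\mathbf{0}, \boldsymbol{\Sigma}_P)$ with $\boldsymbol{\Sigma}_P = \mathbf{I}_{M_T}\bar{P}$. We first compute the
constraint set $\mathcal{V}(\mathbf{H}, \hat{\mathbf{H}})$, given by $(c_1)$ and $(c_2)$, and then we
factorize the matrix $\mathbf{H}$ to solve the minimization problem.
Before this, we state the following result, whose proof is
omitted due to lack of space.

Lemma 3.2: Let $\mathbf{A} \in \mathbb{C}^{M_R \times M_T}$ be an arbitrary matrix and
$\mathbf{x}$ be a random vector with pdf $\mathcal{CN}(\mathbf{0}, \boldsymbol{\Sigma}_P)$. For every pair of real
positive constants $K_1, K_2 > 0$, the following equality holds:

[The displayed identity is not legible in the source. Its surviving
fragments involve $\exp(\cdot)$ and the incomplete gamma function
$\Gamma(-n, K_2/P)$, with]

$$\Gamma(-n, t) = \frac{(-1)^n}{n!} \Big[ \Gamma(0, t) - \exp(-t) \sum_{i=0}^{n-1} \frac{(-1)^i \, i!}{t^{i+1}} \Big],$$

where $\Gamma(0, t)$ denotes the exponential integral function.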
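
The finite-sum expression for $\Gamma(-n, t)$ in terms of the exponential integral $\Gamma(0, t)$ is straightforward to evaluate numerically. The sketch below assumes the standard identity $\Gamma(-n,t) = \frac{(-1)^n}{n!}\big[\Gamma(0,t) - e^{-t}\sum_{i=0}^{n-1}(-1)^i i!/t^{i+1}\big]$ and computes $\Gamma(0,t) = E_1(t)$ from its power series, which is an implementation choice made here to stay within the standard library and is adequate only for moderate arguments (roughly $0 < t \leq 5$). Function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(t, terms=200):
    """Gamma(0, t) = E1(t) = -gamma - ln t - sum_{k>=1} (-t)^k / (k * k!)."""
    if t <= 0.0:
        raise ValueError("t must be positive")
    total, power_over_fact = 0.0, 1.0
    for k in range(1, terms + 1):
        power_over_fact *= -t / k      # now equals (-t)^k / k!
        total -= power_over_fact / k   # subtract (-t)^k / (k * k!)
    return -EULER_GAMMA - math.log(t) + total

def upper_gamma_neg_int(n, t):
    """Gamma(-n, t) for integer n >= 0 via the finite-sum identity."""
    s = sum((-1) ** i * math.factorial(i) / t ** (i + 1) for i in range(n))
    return (-1) ** n / math.factorial(n) * (exp_integral_e1(t) - math.exp(-t) * s)

print(exp_integral_e1(1.0))
print(upper_gamma_neg_int(1, 1.0))
```

A useful sanity check is the recurrence $\Gamma(1-n, t) = -n\,\Gamma(-n, t) + t^{-n}e^{-t}$, which the finite-sum form satisfies for every $n \geq 1$.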

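Then, by using Lemma 3.2 and some algebra, we have that

$$(c_1): \quad \mathrm{tr}\big(\boldsymbol{\Upsilon}\boldsymbol{\Sigma}_P\boldsymbol{\Upsilon}^\dagger + \boldsymbol{\Sigma}\big) = \mathrm{tr}\big(\mathbf{H}\boldsymbol{\Sigma}_P\mathbf{H}^\dagger + \boldsymbol{\Sigma}_0\big), \quad (13)$$

and

$$(c_2): \quad \|\boldsymbol{\Upsilon} + a_M\hat{\mathbf{H}}\|_F^2 \leq \|\mathbf{H} + a_M\hat{\mathbf{H}}\|_F^2 + C, \quad (14)$$

[The definitions of the constants $a_M$, $\lambda_n$ and $C$ that follow (14)
are not legible in the source.]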

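From expression (14) and computing the mutual information,
the minimization in (11) writes as

$$C_M^{\mathrm{MIMO}}(\mathbf{H}, \hat{\mathbf{H}}) = \min_{\boldsymbol{\Upsilon}} \ \log_2\det\big(\mathbf{I}_{M_R} + \boldsymbol{\Upsilon}\boldsymbol{\Sigma}_P\boldsymbol{\Upsilon}^\dagger\boldsymbol{\Sigma}^{-1}\big), \quad \text{subject to } \|\boldsymbol{\Upsilon} + a_M\hat{\mathbf{H}}\|_F^2 \leq \|\mathbf{H} + a_M\hat{\mathbf{H}}\|_F^2 + C, \quad (15)$$

where $\boldsymbol{\Sigma}$ must be chosen such that
$\mathrm{tr}(\boldsymbol{\Upsilon}\boldsymbol{\Sigma}_P\boldsymbol{\Upsilon}^\dagger + \boldsymbol{\Sigma}) = \mathrm{tr}(\mathbf{H}\boldsymbol{\Sigma}_P\mathbf{H}^\dagger + \boldsymbol{\Sigma}_0)$. In order to obtain a simpler and
more tractable expression for (15), we consider the following
decomposition of the matrix $\mathbf{H} = \mathbf{U}\,\mathrm{diag}(\boldsymbol{\lambda})\,\mathbf{V}^\dagger$ with
$\boldsymbol{\lambda} = (\lambda_1, \dots, \lambda_{M_R})^T$. Let $\mathrm{diag}(\boldsymbol{\mu})$ be a diagonal matrix such that
$\mathrm{diag}(\boldsymbol{\mu}) = \mathbf{U}^\dagger\boldsymbol{\Upsilon}\mathbf{V}$, whose diagonal values are given by the
vector $\boldsymbol{\mu} = (\mu_1, \dots, \mu_{M_R})^T$. We define $\tilde{\mathbf{H}} = \mathbf{U}^\dagger\hat{\mathbf{H}}\mathbf{V}$,
the vector $\tilde{\mathbf{h}} = \mathrm{diag}(\tilde{\mathbf{H}}^\dagger)^T$ formed from its diagonal, and let

$$b_M = \|\mathbf{H} + a_M\hat{\mathbf{H}}\|_F^2 + C - a_M^2\big(\|\hat{\mathbf{H}}\|_F^2 - \|\tilde{\mathbf{h}}\|^2\big).$$

Using the above definitions and some algebra, the optimization (15) writes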



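$$C_M^{\mathrm{MIMO}}(\mathbf{H}, \hat{\mathbf{H}}) = \min_{\boldsymbol{\mu}} \ \sum_{i=1}^{M_R} \log_2\Big(1 + \frac{\bar{P}\,|\mu_i|^2}{\sigma^2(\boldsymbol{\mu})}\Big), \quad \text{subject to } \|\boldsymbol{\mu} + a_M\tilde{\mathbf{h}}\|^2 \leq b_M, \quad (16)$$

with $\sigma^2(\boldsymbol{\mu}) = \frac{\bar{P}}{M_R}\big(\|\boldsymbol{\lambda}\|^2 - \|\boldsymbol{\mu}\|^2\big) + \sigma_z^2$. The constraint set
in the minimization (16), which corresponds to the set of
vectors $\{\boldsymbol{\mu} \in \mathbb{C}^{M_R \times 1} : \|\boldsymbol{\mu} + a_M\tilde{\mathbf{h}}\|^2 \leq b_M\}$, is a closed
convex set. Thus, the infimum in (16) is attained
on the boundary of the set, where the constraint holds with
equality (cf. [8]). On the other hand, for every vector $\boldsymbol{\mu}$ such
that $\|\boldsymbol{\mu}\|^2 \leq \|\boldsymbol{\lambda}\|^2$, we observe that the expression (16) is a
monotone increasing function of the squared norm of $\boldsymbol{\mu}$. As a
consequence, it is sufficient to find the optimal vector
$\boldsymbol{\mu}_{\mathrm{opt}}$ minimizing the squared norm over the constraint set.
This becomes a classical convex minimization problem that can
be easily solved by using Lagrange multipliers. The
corresponding achievable rates are then given by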



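$$C_M^{\mathrm{MIMO}}(\mathbf{H}, \hat{\mathbf{H}}) = \log_2\det\big(\mathbf{I}_{M_R} + \boldsymbol{\Upsilon}_{\mathrm{opt}}\boldsymbol{\Sigma}_P\boldsymbol{\Upsilon}_{\mathrm{opt}}^\dagger\,\sigma^{-2}(\boldsymbol{\mu}_{\mathrm{opt}})\big), \quad (17)$$

where the optimal solution is $\boldsymbol{\Upsilon}_{\mathrm{opt}} = \mathbf{U}\,\mathrm{diag}(\boldsymbol{\mu}_{\mathrm{opt}})\,\mathbf{V}^\dagger$ with

$$\boldsymbol{\mu}_{\mathrm{opt}} = \begin{cases} [\text{expression illegible in the source}] & \text{if } b_M \geq 0, \\ [\text{expression illegible in the source}] & \text{otherwise.} \end{cases} \quad (18)$$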



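For any pair of matrices $(\mathbf{H}, \hat{\mathbf{H}})$, the expression (17)
provides the instantaneous achievable rates associated with a
receiver using the decoding rule (4) based on the derived
metric (9). In contrast, the outage rates $R \geq 0$ must be computed
by using the associated outage probability $P^{\mathrm{out}}_{\mathcal{D}}(R, \hat{\mathbf{H}})$. This outage
probability is defined from (2), where here
$\Lambda_{\mathcal{D}}(R, \hat{\mathbf{H}}) = \{\mathbf{H} : C_{\mathcal{D}}(\mathbf{H}, \hat{\mathbf{H}}) < R\}$. Therefore, the maximization of the outage
rate (3), for an outage probability $\gamma_{\mathrm{QoS}}$, is given by

$$C_{\mathcal{D}}(\gamma_{\mathrm{QoS}}, \hat{\mathbf{H}}) = \sup\big\{R \geq 0 : P^{\mathrm{out}}_{\mathcal{D}}(R, \hat{\mathbf{H}}) \leq \gamma_{\mathrm{QoS}}\big\}. \quad (19)$$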
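
In simulations, the supremum in (19) can be approximated from samples of the instantaneous rate. The following sketch assumes that `rate_samples` holds values of $C_{\mathcal{D}}(\mathbf{H}, \hat{\mathbf{H}})$ for channels $\mathbf{H}$ drawn from $\psi(\mathbf{H}|\hat{\mathbf{H}})$ (how these are generated is outside the sketch) and returns the largest $R$ whose empirical outage probability $\Pr(C_{\mathcal{D}} < R)$ does not exceed $\gamma_{\mathrm{QoS}}$. The function name is hypothetical.

```python
import math

def outage_rate(rate_samples, gamma_qos):
    """Empirical sup{R >= 0 : Pr(C < R) <= gamma_qos} from samples of C."""
    if not rate_samples:
        raise ValueError("need at least one sample")
    if not 0.0 <= gamma_qos < 1.0:
        raise ValueError("gamma_qos must lie in [0, 1)")
    ordered = sorted(rate_samples)
    # Number of samples that may fall strictly below R without violating the target.
    k = math.floor(gamma_qos * len(ordered) + 1e-12)
    k = min(k, len(ordered) - 1)
    return ordered[k]

samples = [float(i) for i in range(1, 101)]  # toy data: rates 1.0, 2.0, ..., 100.0
r_1pct = outage_rate(samples, 0.01)
r_10pct = outage_rate(samples, 0.10)
print(r_1pct, r_10pct)
```

The returned value is an order statistic of the samples, so its accuracy for small $\gamma_{\mathrm{QoS}}$ depends on having many more than $1/\gamma_{\mathrm{QoS}}$ samples.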

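Before concluding this section, we note that, following the same
steps as above, we can compute the achievable rates associated with
the mismatched ML decoder (8). These are given by replacing in
expression (17) the solution vector

$$\boldsymbol{\mu}^{\mathrm{ML}}_{\mathrm{opt}} = \frac{\mathrm{Re}\{\mathrm{tr}(\boldsymbol{\Lambda}^\dagger\tilde{\mathbf{H}})\}}{\|\tilde{\mathbf{h}}\|^2} \, \tilde{\mathbf{h}}, \quad \boldsymbol{\Lambda} = \mathrm{diag}(\boldsymbol{\lambda}). \quad (20)$$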



IV. Numerical Results 

In this section we provide numerical results to analyze the
performance of a receiver using the derived metric (9), over
uncorrelated block fading MIMO channels. Performance is
measured in terms of BER and achievable outage rates.
The binary information data are encoded by a rate 1/2
non-recursive non-systematic convolutional channel code with
constraint length 3 defined in octal form by (5,7). Throughout
the simulations, each frame is assumed to consist of 100
MIMO symbols belonging to a 16-QAM constellation with
Gray labeling. The interleaver is a random one operating over
the entire frame with size $100 \cdot M_T \cdot \log_2(16)$ bits. Although
longer interleavers are expected to yield somewhat improved
performance, this length was chosen because of latency
requirements. For each transmitted frame, a different realization
of the channel is drawn and remains constant during
the whole frame. Besides, it is assumed that the average pilot
symbol energy is equal to the average data symbol energy.
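
For reference, the rate 1/2 feed-forward convolutional code with octal generators (5,7) and constraint length 3 can be written in a few lines. This is a minimal sketch: the output ordering (generator 5 first, then generator 7), the all-zero initial state and the absence of trellis termination are assumptions made for the example, as is the function name.

```python
def conv_encode_5_7(bits):
    """Rate-1/2 non-recursive encoder, generators 5 = 101b and 7 = 111b."""
    s1 = s2 = 0                      # s1: previous input bit, s2: the bit before it
    out = []
    for u in bits:
        out.append(u ^ s2)           # generator 5: taps on u(t) and u(t-2)
        out.append(u ^ s1 ^ s2)      # generator 7: taps on u(t), u(t-1), u(t-2)
        s1, s2 = u, s1
    return out

impulse_response = conv_encode_5_7([1, 0, 0])
print(impulse_response)

# Frame size for the 2 x 2, 16-QAM setting: 100 * M_T * log2(16) coded bits.
coded_bits_per_frame = 100 * 2 * 4
info_bits_per_frame = coded_bits_per_frame // 2   # rate 1/2, termination ignored
frame = conv_encode_5_7([0] * info_bits_per_frame)
```

The impulse response 11 01 11 read column-wise reproduces the two generator polynomials, which is a quick way to confirm the tap placement.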

Fig. 1 shows the increase in required $E_b/N_0$ caused by the
CEE for a 2 × 2 MIMO channel in the case of mismatched ML
decoding. We insert $N = 2$, 4 or 8 pilot symbols per frame for
CSIR acquisition. At the reference BER level [exponent of the
$10^{-(\cdot)}$ value illegible in the source] and $N = 2$, we observe
about 2 dB of SNR gain by using the improved decoder.
We also notice that the performance loss of the mismatched
receiver with respect to the derived receiver becomes
insignificant for $N \geq 8$. This can be explained from the expression of
the metric (9): as the number of pilot symbols increases, this
expression tends to the classical Euclidean
distance metric (see equation (8)). These results show that the
investigated decoder outperforms the mismatched decoder.

Fig. 2 compares the average outage rates over all channel
estimates of both mismatched ML decoding (given by expressions
(17) and (20)) and our decoding metric (given by (17) and
(18)) versus the SNR. The 2 × 2 MIMO channel is estimated by
sending 2 pilot symbols per frame. The outage probability has
been fixed to $\gamma_{\mathrm{QoS}} = 0.01$. For comparison, we also display
the upper bounds on these achievable outage rates, i.e. the
EIO capacity (obtained by evaluating (3)) and the ergodic
capacity with perfect channel knowledge at the decoder. It
can be observed that the achievable rate using mismatched
ML decoding is about 5 dB of SNR (at a mean outage rate of
6 bits) away from the EIO capacity. Also, we note that the
investigated decoder achieves higher rates at every SNR value
and reduces the aforementioned SNR gap by about 1.5 dB.

Similar plots are shown in Fig. 3 for a 4 × 4 MIMO
channel estimated with $N = 4$. Again, it can be observed
that the modified decoder achieves higher rates than the
mismatched decoder. However, the performance degradation
of the mismatched decoder compared to the improved decoder has
decreased to less than 1 dB (at 10 bits). This is a consequence
of using orthogonal training sequences with $N \geq M_T$ and the
fact that channel estimation improves as the number
of antennas increases [9].

V. Conclusion 

This paper studied the problem of reception in practical
communication systems, when the receiver has access only to
a noisy estimate of the channel and this estimate is not
available at the transmitter. By minimizing the average of the
transmission error probability over all channel estimation
errors, we derived an improved decoder adapted to imperfect
channel estimation. Although we showed that the proposed decoder
outperforms the classical mismatched approach, the derivation
of a practical decoder achieving the EIO capacity (maximizing
over all possible decoders) remains an open problem.

We also derived the expression of the achievable rates
associated with the improved decoder and compared these to the
classical mismatched ML decoding, which replaces the perfect
channel by its imperfect estimate. As a practical application,
the improved decoder is used for iterative BICM decoding of
MIMO under imperfect channel knowledge. Simulation results
over Rayleigh block fading MIMO channels indicate that
mismatched ML decoding is sub-optimal, in terms of BER and
achievable rates, for short training sequences, and confirm
the adequacy of the improved decoder. This performance
improvement was obtained without introducing any additional
complexity.
Fig. 1. BER performance of the proposed decoder for a 2 × 2 MIMO system over
Rayleigh fading channels for various training sequence lengths.
Fig. 2. Expected outage rates of 2 × 2 MIMO system versus SNR (N = 2).
Fig. 3. Expected outage rates of 4 × 4 MIMO system versus SNR (N = 4).



